Abstract — Reliability demonstration testing for software
products is performed to examine whether the specified
reliability has been achieved once the development process
is completed. This study proposes a model of reliability
demonstration testing for discrete-type software, such as
software for numerical calculations. The number of input data
sets used in the test and the acceptance number of input data
sets causing software failures in the test are designed on the
basis of variation distance. This model has fewer parameters
to be prespecified than the statistical model.

Index Terms — software reliability, reliability demonstration 
testing, discrete-type software, variation distance 

I. Introduction 

Many studies on estimating software reliability analyze
data obtained in the development process of a software
product [1,2]. On the other hand, reliability demonstration
testing, which was originally developed for hardware
products [3], should also be applied to software products for
the purpose of examining whether the specified reliability
has been attained after the development process [4].

We have suggested models for reliability demonstration
testing of discrete-type software such as software for
numerical calculations [5]. In those models, the number of
input data sets for the test and the acceptance number of
input data sets causing software failures in the test are
designed based on the concept of a statistical test, which
requires us to specify the values of the producer's and
consumer's risks. This study proposes a model of reliability
demonstration testing for discrete-type software on the basis
of variation distance [6], which can be regarded as a measure
of the distance between two probability distributions. This
model makes the test easier to design than the statistical
model because it has fewer parameters to be prespecified.

II. Notation and Assumptions 

This study discusses a software reliability demonstration
testing (SRDT) model for software that is used discretely
in time. It is convenient to evaluate the reliability of this
type of software in terms of the probability $p$ that a
software failure occurs for an arbitrarily selected input data
set. This probability is called the unreliability of the
software in the following.

This study considers SRDT in which the software of interest
is tested with n input data sets and is accepted if the number
of input data sets causing software failures in the test does
not exceed an integer c, and rejected otherwise. In this case,
the design variables are n (n = 1, 2, ...) and
c (c = 0, 1, 2, ..., n - 1).

The notation used in this paper is as follows: 
$p_0$  Unreliability of the software on the contract

$p_1$  Tolerable upper limit for unreliability of the software
($p_0 < p_1$)

$E_1$  Event that the number of input data sets causing software
failures in the test exceeds c

$\bar{E}_1$  Complement of $E_1$

The assumptions made throughout this paper are listed 
below: 

(i) No fault is removed during the test. All the faults that
cause software failures during the test are removed after
the test is completed.

(ii) The values of $p_0$ and $p_1$ are specified at the
beginning of the test.

III. Statistical Model 

For the purpose of determining the values of n and c in 
the above SRDT, this section presents a model based on the 
concept of a statistical test. 

In the above SRDT, the probability that event $E_1$ occurs
when $p = p_0$ is given by

$$\Pr[E_1 \mid p_0] = \sum_{i=c+1}^{n} \binom{n}{i} p_0^{i} (1 - p_0)^{n-i}. \tag{1}$$



Likewise, the probability that event $\bar{E}_1$ occurs when
$p = p_1$ is expressed by

$$\Pr[\bar{E}_1 \mid p_1] = \sum_{i=0}^{c} \binom{n}{i} p_1^{i} (1 - p_1)^{n-i}. \tag{2}$$



The probability in Eq. (1) is called a producer's risk in
SRDT as well as in sampling theory, while that in Eq. (2) is
called a consumer's risk. The producer's and consumer's risks,
respectively, signify the probabilities of Type I and Type II
errors in terms of a statistical test.
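
As a minimal sketch of Eqs. (1) and (2), the two risks can be evaluated directly from the binomial distribution. The function names below are illustrative assumptions, not part of the original model. The producer's risk is computed as one minus the lower tail, which gives the same value as the upper-tail sum in Eq. (1) while avoiding very large binomial coefficients.

```python
from math import comb


def binom_cdf(c, n, p):
    """P[X <= c] for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(c + 1))


def producer_risk(n, c, p0):
    """Eq. (1): probability of rejecting (more than c failures) when p = p0."""
    return 1.0 - binom_cdf(c, n, p0)


def consumer_risk(n, c, p1):
    """Eq. (2): probability of accepting (at most c failures) when p = p1."""
    return binom_cdf(c, n, p1)


if __name__ == "__main__":
    # Illustrative values only: n = 1000 test inputs, acceptance number c = 1.
    print(producer_risk(1000, 1, 0.001))
    print(consumer_risk(1000, 1, 0.002))
```

With n = 1000, c = 1, $p_0 = 0.001$, and $p_1 = 0.002$, this gives a producer's risk of about 0.264 and a consumer's risk of about 0.406.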

When the values of the producer's and consumer's risks
are specified to be equal to or less than $\alpha$ and
$\beta$, respectively, a feasible region for designing a
software reliability demonstration test is written as

$$\sum_{i=c+1}^{n} \binom{n}{i} p_0^{i} (1 - p_0)^{n-i} \le \alpha, \tag{3}$$

$$\sum_{i=0}^{c} \binom{n}{i} p_1^{i} (1 - p_1)^{n-i} \le \beta. \tag{4}$$



In general, the simultaneous equations obtained by 
replacing inequalities in Eqs. (3) and (4) by equalities do not 
always have a unique solution since n and c are integers. A 
practical solution would be obtained by using the following 
four conditions: 

(i) The producer's risk does not exceed $\alpha$.

(ii) The consumer's risk does not exceed $\beta$.

(iii) The number of input data sets n is the minimum. 

(iv) Acceptance number c is the minimum. 
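
The four conditions above can be sketched as a brute-force search. This is only one possible reading: it assumes that condition (iii) takes priority over condition (iv), that is, the smallest feasible n is found first and then the smallest c for that n. The function name `design_statistical` and the cap `n_max` are hypothetical.

```python
from math import comb


def binom_cdf(c, n, p):
    """P[X <= c] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(c + 1))


def design_statistical(p0, p1, alpha, beta, n_max=2000):
    """Return (n, c) with the smallest n, then the smallest c, such that
    the producer's risk <= alpha and the consumer's risk <= beta.
    Returns None if no plan is found up to n_max (an assumed search cap)."""
    for n in range(1, n_max + 1):
        for c in range(n):
            consumer = binom_cdf(c, n, p1)
            if consumer > beta:
                break  # the consumer's risk only grows with c, so stop early
            producer = 1.0 - binom_cdf(c, n, p0)
            if producer <= alpha:
                return n, c
    return None


if __name__ == "__main__":
    # Illustrative inputs, not taken from the paper.
    print(design_statistical(p0=0.01, p1=0.1, alpha=0.1, beta=0.1))
```

The search is quadratic in n in the worst case, so it suits small illustrative problems rather than very small values of $p_0$ and $p_1$.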

IV. Variation Distance Model 

This section presents a model based on the variation
distance with a view to determining the values of n and c in
the above SRDT.

A. Formulation of Variation Distance 

Let $F_1$ and $F_2$ denote two discrete probability
distributions, and let $q_{1i}$ and $q_{2i}$ (i = 0, 1, 2, ...)
be the probability mass functions associated with $F_1$ and
$F_2$, respectively. Then the variation distance [6] of $F_1$
and $F_2$ is defined by

$$d_v(F_1, F_2) = \sum_{i=0}^{\infty} \left| q_{1i} - q_{2i} \right|. \tag{5}$$



The variation distance in Eq. (5) can be regarded as a
measure of the distance between $F_1$ and $F_2$. In other
words, as the variation distance increases, we can distinguish
$F_1$ from $F_2$ more easily.
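
A minimal sketch of this definition for two probability mass functions given as equal-length lists. The function name is an assumption. Note that the sum of absolute differences is used without a factor of 1/2, so the value ranges from 0 (identical distributions) to 2 (disjoint supports).

```python
def variation_distance(q1, q2):
    """Sum of |q1[i] - q2[i]| over a common, finite support."""
    if len(q1) != len(q2):
        raise ValueError("both mass functions must share the same support")
    return sum(abs(a - b) for a, b in zip(q1, q2))


if __name__ == "__main__":
    print(variation_distance([0.5, 0.5], [0.5, 0.5]))    # identical
    print(variation_distance([0.5, 0.5], [0.25, 0.75]))  # partly overlapping
    print(variation_distance([1.0, 0.0], [0.0, 1.0]))    # disjoint
```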
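When $p = p_0$, we have

$$\Pr[E_1 \mid p_0] = \sum_{i=c+1}^{n} \binom{n}{i} p_0^{i} (1 - p_0)^{n-i}, \tag{6}$$

$$\Pr[\bar{E}_1 \mid p_0] = \sum_{i=0}^{c} \binom{n}{i} p_0^{i} (1 - p_0)^{n-i}. \tag{7}$$

In the case of $p = p_1$, we have

$$\Pr[E_1 \mid p_1] = \sum_{i=c+1}^{n} \binom{n}{i} p_1^{i} (1 - p_1)^{n-i}, \tag{8}$$

$$\Pr[\bar{E}_1 \mid p_1] = \sum_{i=0}^{c} \binom{n}{i} p_1^{i} (1 - p_1)^{n-i}. \tag{9}$$

Eqs. (6) and (7) together represent a probability
distribution $F_1$, expressed by a probability mass function
$q_{1i}$ (i = 0, 1) in Eq. (5), when $p = p_0$. Eqs. (8) and (9)
together express another probability distribution $F_2$,
characterized by a probability mass function $q_{2i}$
(i = 0, 1) in Eq. (5), in the case of $p = p_1$. Hence, the
variation distance of these two probability distributions is
written as

$$d_v(F_1, F_2) = 2 \left| \sum_{i=0}^{c} \binom{n}{i} \left[ p_0^{i} (1 - p_0)^{n-i} - p_1^{i} (1 - p_1)^{n-i} \right] \right|. \tag{10}$$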



We can obtain an optimal pair of values, (n, c)*, by maximizing
$d_v(F_1, F_2)$ in Eq. (10), since we can distinguish $F_1$ from
$F_2$ more easily as the variation distance increases.
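
As a sketch, the variation distance of the test can be evaluated for a given pair (n, c) as twice the absolute difference between the two acceptance probabilities, and the best c for a fixed n can be found by direct search. The function names and the search bound `c_max` are assumptions made for illustration.

```python
from math import comb


def binom_cdf(c, n, p):
    """P[X <= c] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(c + 1))


def srdt_variation_distance(n, c, p0, p1):
    """Twice the gap between the acceptance probabilities at p0 and p1."""
    return 2.0 * abs(binom_cdf(c, n, p0) - binom_cdf(c, n, p1))


def best_c_by_search(n, p0, p1, c_max):
    """Acceptance number in 0..c_max with the largest variation distance."""
    return max(range(c_max + 1),
               key=lambda c: srdt_variation_distance(n, c, p0, p1))


if __name__ == "__main__":
    n, p0, p1 = 1000, 0.001, 0.002
    for c in range(4):
        print(c, srdt_variation_distance(n, c, p0, p1))
    print("best c:", best_c_by_search(n, p0, p1, c_max=10))
```

Restricting the search to a small `c_max` keeps the binomial terms within floating-point range for large n.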

B. An Optimal Value c* for Each Value of n

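This section shows the existence of the optimal value c*
for each value of n.

Let $D(c) = d_v(F_1, F_2)$. Because $p_0 < p_1$, the sum inside
the absolute value in Eq. (10) is nonnegative, so the
difference of D(c) is given by

$$D(c+1) - D(c) = 2 \binom{n}{c+1} \left[ p_0^{c+1} (1 - p_0)^{n-c-1} - p_1^{c+1} (1 - p_1)^{n-c-1} \right]. \tag{11}$$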
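The sign of D(c + 1) - D(c) is determined as follows.
(i) If

$$c < \frac{n \log \dfrac{1 - p_1}{1 - p_0}}{\log \dfrac{p_0}{p_1} + \log \dfrac{1 - p_1}{1 - p_0}} - 1, \tag{12}$$

then D(c + 1) - D(c) > 0.
(ii) If

$$c = \frac{n \log \dfrac{1 - p_1}{1 - p_0}}{\log \dfrac{p_0}{p_1} + \log \dfrac{1 - p_1}{1 - p_0}} - 1, \tag{13}$$

then D(c + 1) - D(c) = 0.
(iii) If

$$c > \frac{n \log \dfrac{1 - p_1}{1 - p_0}}{\log \dfrac{p_0}{p_1} + \log \dfrac{1 - p_1}{1 - p_0}} - 1, \tag{14}$$

then D(c + 1) - D(c) < 0.
Therefore, let

$$\psi = \frac{n \log \dfrac{1 - p_1}{1 - p_0}}{\log \dfrac{p_0}{p_1} + \log \dfrac{1 - p_1}{1 - p_0}} - 1, \tag{15}$$

then we have the following theorem.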
Theorem 1:

(i) If $\psi$ is not an integer, there exists a unique c* that
maximizes D(c), and c* is the minimum integer greater than
$\psi$.

(ii) If $\psi$ is an integer, there exist two values of c that
maximize D(c), namely $\psi$ and $\psi + 1$.
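
Theorem 1 can be turned into a short closed-form computation. The sketch below assumes the threshold is $\psi = n \log\frac{1-p_1}{1-p_0} \big/ \left( \log\frac{p_0}{p_1} + \log\frac{1-p_1}{1-p_0} \right) - 1$; the function names and the floating-point tolerance used to detect an integer $\psi$ are illustrative assumptions.

```python
from math import floor, log


def psi(n, p0, p1):
    """Threshold on c at which D(c + 1) - D(c) changes sign."""
    ratio = log((1 - p1) / (1 - p0))
    return n * ratio / (log(p0 / p1) + ratio) - 1


def optimal_c(n, p0, p1, tol=1e-12):
    """List of acceptance numbers maximizing D(c) according to Theorem 1."""
    value = psi(n, p0, p1)
    nearest = round(value)
    if abs(value - nearest) < tol:       # psi is (numerically) an integer
        return [nearest, nearest + 1]    # two maximizers
    return [floor(value) + 1]            # smallest integer greater than psi


if __name__ == "__main__":
    for n in (1000, 2000, 3000, 4000, 5000):
        print(n, psi(n, 0.001, 0.002), optimal_c(n, 0.001, 0.002))
```

For $p_0 = 0.001$ and $p_1 = 0.002$ with n = 1000, 2000, 3000, 4000, and 5000, this returns c* = 1, 2, 4, 5, and 7.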

V. Relation Between the Statistical Model and the Variation
Distance Model

This section reveals the relation between the statistical
model in Section III and the variation distance model in
Section IV. $d_v(F_1, F_2)$ in Eq. (10) is expressed as follows:

$$d_v(F_1, F_2) = 2 - 2 \left( \Pr[E_1 \mid p_0] + \Pr[\bar{E}_1 \mid p_1] \right), \tag{16}$$

using the producer's risk $\Pr[E_1 \mid p_0]$ of Eq. (1) and the
consumer's risk $\Pr[\bar{E}_1 \mid p_1]$ of Eq. (2).

Therefore, maximizing the variation distance $d_v(F_1, F_2)$ is
equivalent to minimizing the sum of the producer's risk
$\Pr[E_1 \mid p_0]$ and the consumer's risk
$\Pr[\bar{E}_1 \mid p_1]$ in the statistical model. This amounts
to minimizing the expected cost when the cost of the producer's
risk is equal to the cost of the consumer's risk.
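
The step from the two-point variation distance to the risk-sum form can be written out as follows. This is a sketch using only the definitions above. It relies on $p_0 < p_1$, which makes the acceptance probability at $p_0$ at least as large as that at $p_1$, so the absolute values can be dropped.

```latex
\begin{align*}
d_v(F_1, F_2)
  &= \bigl|\Pr[E_1 \mid p_0] - \Pr[E_1 \mid p_1]\bigr|
   + \bigl|\Pr[\bar{E}_1 \mid p_0] - \Pr[\bar{E}_1 \mid p_1]\bigr| \\
  &= 2\,\bigl(\Pr[\bar{E}_1 \mid p_0] - \Pr[\bar{E}_1 \mid p_1]\bigr)
     && \text{since } \Pr[E_1 \mid p] = 1 - \Pr[\bar{E}_1 \mid p] \\
  &= 2\,\bigl(1 - \Pr[E_1 \mid p_0] - \Pr[\bar{E}_1 \mid p_1]\bigr) \\
  &= 2 - 2\,\bigl(\Pr[E_1 \mid p_0] + \Pr[\bar{E}_1 \mid p_1]\bigr).
\end{align*}
```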

VI. Numerical Examples

Figure 1 shows numerical examples of the variation distance
$d_v(F_1, F_2)$ when the acceptance number c of input data sets
causing software failures varies over c = 0, 1, 2, ..., 10 for
n = 1000, 2000, 3000, 4000, and 5000, where $p_0 = 0.001$ and
$p_1 = 0.002$.

For n = 1000, 2000, 3000, 4000, and 5000, the optimal values
are c* = 1, 2, 4, 5, and 7, respectively. It is observed that
both c* and its corresponding variation distance increase with
increasing n.

VII. Conclusions 

This study proposed a model of reliability demonstration
testing for discrete-type software such as software for
numerical calculations. When the unreliability of the software
on the contract, $p_0$, and the tolerable upper limit for
unreliability of the software, $p_1$, are given, the number of
input data sets for the test, n, and the acceptance number of
input data sets causing software failures in the test, c, are
designed by maximizing $d_v(F_1, F_2)$. This model has fewer
parameters to be prespecified than the statistical model.
Theorem 1 reveals the existence of the optimal value c* for
each value of n.

Figure 1. Numerical examples of variation distance